Integrating Data from The Preposition Project into FrameNet

Author

  • Kenneth C. Litkowski
Abstract

In the course of The Preposition Project, FrameNet sentences are used as instances to characterize preposition behavior. FrameNet sentences containing prepositional phrases beginning with a given preposition are presented to a lexicographer, who tags the preposition with a sense from a sense inventory derived from the Oxford Dictionary of English. For the 34 prepositions used in the SemEval-2007 task on preposition disambiguation, over 25,000 instances were tagged in this way. Since each instance, i.e., a prepositional phrase, had already been tagged by the FrameNet lexicographers with a frame element for a particular frame, each sense in The Preposition Project has an associated set of (Frame, FrameElement) pairs. Of the 25,641 instances, 18,470 are FrameNet core frame elements, 4,518 are FrameNet peripheral frame elements, and 1,676 are FrameNet extra-thematic frame elements. Since, in general, prepositions are not targets for the FrameNet project, data from The Preposition Project provide an opportunity to make a systematic attempt to expand the treatment of prepositions in FrameNet. This paper describes various analyses of data from The Preposition Project and FrameNet to explore and develop steps that can be taken toward that expansion.
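
The association between senses and (Frame, FrameElement) pairs described in the abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the record layout `TaggedInstance`, the helper names `pairs_by_sense` and `coreness_counts`, and every sense, frame, and frame element label in `SAMPLE` are hypothetical placeholders, not data or code from The Preposition Project or FrameNet.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class TaggedInstance(NamedTuple):
    """One sense-tagged prepositional phrase (hypothetical record layout)."""
    preposition: str
    sense: str          # placeholder sense label, not a real sense ID
    frame: str          # placeholder frame name
    frame_element: str  # placeholder frame element name
    coreness: str       # "core", "peripheral", or "extra-thematic"


def pairs_by_sense(instances):
    """Map each (preposition, sense) to the set of (frame, frame element) pairs seen with it."""
    pairs = defaultdict(set)
    for inst in instances:
        pairs[(inst.preposition, inst.sense)].add((inst.frame, inst.frame_element))
    return dict(pairs)


def coreness_counts(instances):
    """Count instances by the coreness type of their frame element."""
    return Counter(inst.coreness for inst in instances)


# Invented toy records, for illustration only.
SAMPLE = [
    TaggedInstance("with", "sense-A", "FrameX", "Role1", "peripheral"),
    TaggedInstance("with", "sense-A", "FrameY", "Role2", "core"),
    TaggedInstance("with", "sense-B", "FrameZ", "Role3", "core"),
    TaggedInstance("with", "sense-A", "FrameX", "Role1", "peripheral"),
]

if __name__ == "__main__":
    for key, pairs in sorted(pairs_by_sense(SAMPLE).items()):
        print(key, sorted(pairs))
    print(dict(coreness_counts(SAMPLE)))
```

Using a set for each sense keeps only the distinct (Frame, FrameElement) pairs, while the counter tallies every instance, which matches the two kinds of figures the abstract reports.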

Related resources

Exploiting Semantic Role Resources for Preposition Disambiguation

This article describes how semantic role resources can be exploited for preposition disambiguation. The main resources include the semantic role annotations provided by the Penn Treebank and FrameNet tagged corpora. The resources also include the assertions contained in the Factotum knowledge base, as well as information from Cyc and Conceptual Graphs. A common inventory is derived from these i...

The Preposition Project

Prepositions are an important vehicle for indicating semantic roles. Their meanings are difficult to analyze and they are often discarded in processing text. The Preposition Project is designed to provide a comprehensive database of preposition senses suitable for use in natural language processing applications. In the project, prepositions in the FrameNet corpus are disambiguated using a sense...

Pattern Dictionary of English Prepositions

We present a new lexical resource for the study of preposition behavior, the Pattern Dictionary of English Prepositions (PDEP). This dictionary, which follows principles laid out in Hanks’ theory of norms and exploitations, is linked to 81,509 sentences for 304 prepositions, which have been made available under The Preposition Project (TPP). Notably, 47,285 sentences, initially untagged, provid...

Preposition Disambiguation: Still a Problem

Considerable recent progress has been made in preposition disambiguation using the SemEval 2007 corpus, with results reaching accuracy of over 88 percent. However, with a new corpus of tagged instances, use of the models shows a decline in performance to around 43 percent. This suggests that recent efforts suffer from an out-of-domain problem. Detailed examination of the dimensions of this prob...

Preposition Semantic Classification via Treebank and FrameNet

This paper reports on experiments in classifying the semantic role annotations assigned to prepositional phrases in both PENN TREEBANK (version II) and FRAMENET (version 0.75). In both cases, experiments are done to see how the prepositions can be classified given the dataset’s role inventory, using standard word-sense disambiguation features, such as the parts of speech of surrounding words, a...

Publication date: 2007